66 research outputs found

    Desiderata for the development of next-generation electronic health record phenotype libraries

    Background: High-quality phenotype definitions are desirable to enable the extraction of patient cohorts from large electronic health record repositories and are characterized by properties such as portability, reproducibility, and validity. Phenotype libraries, where definitions are stored, have the potential to contribute significantly to the quality of the definitions they host. In this work, we present a set of desiderata for the design of a next-generation phenotype library that is able to ensure the quality of hosted definitions by combining the functionality currently offered by disparate tooling. Methods: A group of researchers examined work to date on phenotype models, implementation, and validation, as well as contemporary phenotype libraries developed as a part of their own phenomics communities. Existing phenotype frameworks were also examined. This work was translated and refined by all the authors into a set of best practices. Results: We present 14 library desiderata that promote high-quality phenotype definitions, in the areas of modelling, logging, validation, and sharing and warehousing. Conclusions: There are a number of choices to be made when constructing phenotype libraries. Our considerations distil the best practices in the field and include pointers towards their further development to support portable, reproducible, and clinically valid phenotype design. The provision of high-quality phenotype definitions enables electronic health record data to be used more effectively in medical domains.
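
    A purely illustrative sketch (not taken from the paper) of what such a library entry might record, in Python: versioned metadata for reproducibility, an explicit code list for portability, and a validation log. All field names here are our own assumptions.

    from dataclasses import dataclass, field

    @dataclass
    class PhenotypeDefinition:
        name: str                   # e.g. "Type 2 diabetes"
        version: str                # immutable version tag, for reproducibility
        coding_system: str          # e.g. "ICD-10" or "SNOMED CT"
        codes: list[str]            # clinical code list defining the cohort
        logic: str                  # inclusion/exclusion logic, ideally computable
        validation_log: list[str] = field(default_factory=list)

        def record_validation(self, note: str) -> None:
            """Append an audit entry, e.g. a PPV estimate from chart review."""
            self.validation_log.append(note)

    t2dm = PhenotypeDefinition(
        name="Type 2 diabetes", version="1.0.0", coding_system="ICD-10",
        codes=["E11"], logic="any diagnosis code in `codes`",
    )
    t2dm.record_validation("PPV 0.92 against manual review (hypothetical figure)")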

    A discriminative method for family-based protein remote homology detection that combines inductive logic programming and propositional models

    Background: Remote homology detection is a hard computational problem. Most approaches have trained computational models using either full protein sequences or multiple sequence alignments (MSA), including all positions. However, for proteins in the "twilight zone", only some segments of the sequences (motifs) are conserved. We introduce a novel logical representation that allows us to represent physico-chemical properties of sequences, conserved amino acid positions, and conserved physico-chemical positions in the MSA. From this, Inductive Logic Programming (ILP) finds the most frequent patterns (motifs) and uses them to train propositional models, such as decision trees and support vector machines (SVM). Results: We use the SCOP database to perform our experiments by evaluating protein recognition within the same superfamily. Our results show that our methodology, when using SVM, performs significantly better than some state-of-the-art methods and comparably to others. Moreover, our method provides a comprehensible set of logical rules that can help to understand what determines a protein's function. Conclusions: The strategy of selecting only the most frequent patterns is effective for remote homology detection. This is possible through a suitable first-order logical representation of homologous properties, and through a set of frequent patterns, found by an ILP system, that summarizes essential features of protein functions.
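
    The ILP stage requires a dedicated engine, but the downstream propositional stage is easy to sketch. Assuming the motifs have already been found, each sequence becomes a binary vector of motif occurrences on which an SVM is trained; this is a toy illustration with scikit-learn, and the patterns, sequences, and labels are placeholders, not data from the paper.

    import re
    from sklearn.svm import SVC

    motifs = [r"G.{2}GKT", r"H..EH", r"C..C"]          # toy motif patterns
    sequences = ["MAGASGKTLLC", "MKHVAEHQQQ", "MACDDCKLAGHSGKT"]
    labels = [1, 0, 1]  # 1 = member of the target superfamily (toy labels)

    def featurize(seq: str) -> list[int]:
        # one binary feature per motif: does it occur in the sequence?
        return [1 if re.search(p, seq) else 0 for p in motifs]

    X = [featurize(s) for s in sequences]
    clf = SVC(kernel="linear").fit(X, labels)
    print(clf.predict([featurize("MSGASGKTAAA")]))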

    The development of a knowledge base for basic active structures: an example case of dopamine agonists

    Background: Chemical compounds affecting a bioactivity can usually be classified into several groups, each of which shares a characteristic substructure. We call these substructures "basic active structures" or BASs. The extraction of BASs is challenging when the database of compounds contains a variety of skeletons. Data mining technology, combined with the work of chemists, has enabled the systematic elaboration of BASs. Results: This paper presents a BAS knowledge base, BASiC, which currently covers 46 activities and is available on the Internet. We use the dopamine agonists D1, D2, and Dauto as examples and illustrate the process of BAS extraction. The resulting BASs were reasonably interpreted after proposing a few template structures. Conclusions: The knowledge base is useful for drug design. Proposed BASs and their supporting structures in the knowledge base will facilitate the development of new template structures for other activities, and will be useful in the design of new lead compounds via reasonable interpretations of active structures.
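
    The core operation behind such a knowledge base is substructure matching: testing which active compounds contain a candidate BAS. A minimal sketch assuming RDKit, with a toy SMARTS pattern and SMILES strings rather than actual entries from BASiC:

    from rdkit import Chem

    candidate_bas = Chem.MolFromSmarts("c1ccccc1CCN")  # toy arylethylamine-like core
    compounds = {
        "dopamine": "NCCc1ccc(O)c(O)c1",
        "toluene": "Cc1ccccc1",
    }
    for name, smiles in compounds.items():
        mol = Chem.MolFromSmiles(smiles)
        print(name, mol.HasSubstructMatch(candidate_bas))  # dopamine: True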

    CheS-Mapper - Chemical Space Mapping and Visualization in 3D

    Analyzing chemical datasets is a challenging task for scientific researchers in the field of chemoinformatics. It is important, yet difficult, to understand the relationship between the structure of chemical compounds, their physico-chemical properties, and their biological or toxic effects. In that respect, visualization tools can help researchers to better comprehend the underlying correlations. Our recently developed 3D molecular viewer CheS-Mapper (Chemical Space Mapper) divides large datasets into clusters of similar compounds and arranges them in 3D space such that their spatial proximity reflects their similarity. The user can indirectly determine similarity by selecting which features to employ in the process. The tool can use and calculate different kinds of features, such as structural fragments as well as quantitative chemical descriptors. These features can be highlighted within CheS-Mapper, which helps the chemist to better understand patterns and regularities and to relate the observations to established scientific knowledge. Finally, the tool can also be used to select and export specific subsets of a given dataset for further analysis.
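
    CheS-Mapper itself is a Java application, but its two core steps are easy to illustrate conceptually in Python: cluster compounds on user-chosen feature vectors, then embed them in 3D so that spatial proximity reflects similarity. The descriptor matrix below is random placeholder data, not chemistry.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    descriptors = rng.normal(size=(50, 8))   # 50 compounds x 8 chosen features

    cluster_ids = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(descriptors)
    coords_3d = PCA(n_components=3).fit_transform(descriptors)  # 3D layout
    print(cluster_ids[:5])
    print(coords_3d[:2])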

    Improving the diagnosis of heart failure in patients with atrial fibrillation.

    OBJECTIVE: To improve the echocardiographic assessment of heart failure in patients with atrial fibrillation (AF) by comparing conventional averaging of consecutive beats with an index-beat approach, whereby measurements are taken after two cycles with similar R-R intervals. METHODS: Transthoracic echocardiography was performed using a standardised and blinded protocol in patients enrolled in the RATE-AF (RAte control Therapy Evaluation in permanent Atrial Fibrillation) randomised trial. We compared the reproducibility of the index-beat and conventional consecutive-beat methods for calculating left ventricular ejection fraction (LVEF), global longitudinal strain (GLS) and E/e' (mitral E wave max/average diastolic tissue Doppler velocity), and assessed intraoperator/interoperator variability, time efficiency and validity against natriuretic peptides. RESULTS: 160 patients were included, 46% of whom were women, with a median age of 75 years (IQR 69-82) and a median heart rate of 100 beats per minute (IQR 86-112). The index-beat method had the lowest within-beat coefficient of variation for LVEF (32%, vs 51% for 5 consecutive beats and 53% for 10 consecutive beats), GLS (26%, vs 43% and 42%) and E/e' (25%, vs 41% and 41%). Intraoperator (n=50) and interoperator (n=18) reproducibility were both superior for index-beats, and this method was quicker to perform (p<0.001): 35.4 s to measure E/e' (95% CI 33.1 to 37.8) compared with 44.7 s for 5-beat (95% CI 41.8 to 47.5) and 98.1 s for 10-beat (95% CI 91.7 to 104.4) analyses. Using a single index-beat did not compromise the association of LVEF, GLS or E/e' with natriuretic peptide levels. CONCLUSIONS: Compared with averaging of multiple beats in patients with AF, the index-beat approach improves reproducibility and saves time without a negative impact on validity, potentially improving the diagnosis and classification of heart failure in patients with AF.
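
    A sketch of our reading of index-beat selection (not trial code): scan the R-R interval series for two consecutive cycles that agree within a tolerance, and take the following beat as the index beat on which to measure.

    import numpy as np

    rr = np.array([810, 620, 905, 880, 870, 640, 990])  # R-R intervals in ms (toy data)

    def index_beats(rr_ms: np.ndarray, tol: float = 0.10) -> list[int]:
        """Return indices of beats preceded by two similar consecutive cycles."""
        hits = []
        for i in range(2, len(rr_ms) + 1):
            a, b = rr_ms[i - 2], rr_ms[i - 1]
            if abs(a - b) / max(a, b) <= tol:
                hits.append(i)  # the beat after the two matched cycles
        return hits

    print(index_beats(rr))  # beats following the 905/880 and 880/870 ms pairs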

    Open Babel: An open chemical toolbox

    Background: A frequent problem in computational modeling is the interconversion of chemical structures between different formats. While standard interchange formats exist (for example, Chemical Markup Language) and de facto standards have arisen (for example, the SMILES format), the need to interconvert formats remains a continuing problem due to the multitude of different application areas for chemistry data, differences in the data stored by different formats (0D versus 3D, for example), and competition between software packages along with a lack of vendor-neutral formats. Results: We discuss, for the first time, Open Babel, an open-source chemical toolbox that speaks the many languages of chemical data. Open Babel version 2.3 interconverts over 110 formats. The need to represent such a wide variety of chemical and molecular data requires a library that implements a wide range of cheminformatics algorithms, from partial charge assignment and aromaticity detection to bond order perception and canonicalization. We detail the implementation of Open Babel, describe key advances in the 2.3 release, and outline a variety of uses, both in terms of software products and scientific research, including applications far beyond simple format interconversion. Conclusions: Open Babel presents a solution to the proliferation of multiple chemical file formats. In addition, it provides a variety of useful utilities, from conformer searching and 2D depiction to filtering, batch conversion, and substructure and similarity searching. For developers, it can be used as a programming library to handle chemical data in areas such as organic chemistry, drug design, materials science, and computational chemistry. It is freely available under an open-source license.
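
    A minimal interconversion example assuming Open Babel's Python bindings (exposed as openbabel.pybel in the 3.x releases); the molecule is an arbitrary SMILES string.

    from openbabel import pybel

    mol = pybel.readstring("smi", "c1ccccc1O")      # parse phenol from SMILES
    mol.title = "phenol"
    print(mol.write("inchi"))                       # re-emit in another format
    mol.write("sdf", "phenol.sdf", overwrite=True)  # or write straight to a file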

    Collaborative development of predictive toxicology applications

    OpenTox provides an interoperable, standards-based Framework for the support of predictive toxicology data management, algorithms, modelling, validation, and reporting. It is relevant to satisfying the chemical safety assessment requirements of the REACH legislation, as it supports access to experimental data, (Quantitative) Structure-Activity Relationship models, and toxicological information through an integrating platform that adheres to regulatory requirements and OECD validation principles. Initial research defined the essential components of the Framework, including the approach to data access, schema and management, the use of controlled vocabularies and ontologies, the architecture, web service and communications protocols, and the selection and integration of algorithms for predictive modelling. OpenTox provides end-user-oriented tools to non-computational specialists, risk assessors, and toxicological experts, in addition to Application Programming Interfaces (APIs) for developers of new applications. OpenTox actively supports public standards for data representation, interfaces, vocabularies, and ontologies; open-source approaches to core platform components; and community-based collaboration approaches, so as to advance system interoperability goals.
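
    OpenTox resources (compounds, datasets, algorithms, models) are addressed as REST web services. The sketch below is a generic illustration only: the host and resource URIs are hypothetical, and the real endpoints and representations are defined by the OpenTox API documentation.

    import requests

    BASE = "https://opentox.example.org"  # hypothetical service host

    # Ask a (hypothetical) model resource to predict over a dataset URI;
    # OpenTox services use content negotiation for representations.
    resp = requests.post(
        f"{BASE}/model/42",
        data={"dataset_uri": f"{BASE}/dataset/7"},
        headers={"Accept": "text/uri-list"},
    )
    resp.raise_for_status()
    print(resp.text)  # URI of the created prediction resource (hypothetical)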